Geodesic Distance-based Kernel Construction for Gaussian Process Value Function Approximation

Author

  • HUNOR JAKAB
Abstract

Finding accurate approximations to state and action value functions is essential in reinforcement learning tasks on continuous Markov decision processes. Using Gaussian processes as function approximators, we can simultaneously represent model confidence and generalize to unvisited states. To improve the accuracy of the value function approximation, in this article I present a new method for constructing geodesic-distance-based kernel functions from the graph structure induced by the Markov decision process. Using sparse on-line Gaussian process regression, the nodes and edges of the graph are allocated during on-line learning, in parallel with the inclusion of new measurements in the basis vector set. This results in a more compact and efficient graph structure and in more accurate value function estimates. The approximation accuracy is tested on a simulated robotic control task.
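A minimal sketch of how such a kernel could be assembled, assuming the graph nodes are the states kept in the GP basis vector set and that edges connect states within a fixed Euclidean radius (both assumptions made here for illustration, not the paper's exact construction): pairwise geodesic distances are obtained by shortest-path search on the graph and then passed through a Gaussian profile. The function and parameter names (`build_geodesic_kernel`, `radius`, `sigma`) are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import shortest_path


def build_geodesic_kernel(basis_states, radius, sigma):
    """Hypothetical sketch: Gaussian kernel of graph geodesic distances.

    basis_states : (n, d) array of states kept in the GP basis vector set.
    radius       : Euclidean radius used to connect neighbouring states
                   (an assumed graph-construction rule, not the paper's).
    sigma        : kernel bandwidth applied to the geodesic distance.
    """
    # Pairwise Euclidean distances between the stored basis states.
    euclid = cdist(basis_states, basis_states)

    # Keep only "local" edges; distant pairs are connected indirectly
    # through the graph, which is what makes the kernel geodesic.
    adjacency = np.where(euclid <= radius, euclid, 0.0)

    # All-pairs shortest-path (geodesic) distances on the state graph.
    geodesic = shortest_path(adjacency, method="D", directed=False)

    # Disconnected components have infinite geodesic distance, so exp()
    # maps them to zero similarity automatically.
    return np.exp(-geodesic ** 2 / (2.0 * sigma ** 2))


# Example usage on a few 2-D states.
states = np.array([[0.0, 0.0], [0.5, 0.1], [1.0, 0.0], [3.0, 3.0]])
K = build_geodesic_kernel(states, radius=1.0, sigma=0.8)
print(K.round(3))
```

Note that a Gaussian of a graph geodesic distance is not guaranteed to yield a positive semi-definite kernel matrix, and this sketch rebuilds the graph in batch; handling those issues and updating the graph incrementally as new basis vectors are admitted is precisely where the method described above goes beyond this illustration.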


Similar Articles

Robot Control by Least-Squares Policy Iteration with Geodesic Gaussian Kernels

The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it does not allow for discontinuity which typically arises in real-world reinforcement learning tasks. To overcome this problem, new basis functions called geo...
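For context, the least-squares step that such basis functions feed into can be sketched as a single regularized linear solve; the feature matrices, the ridge term `reg`, and the function name below are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np


def lstd_weights(phi, phi_next, rewards, gamma, reg=1e-6):
    """Sketch of the least-squares fixed-point solve used inside LSPI.

    phi      : (n, k) basis features of the visited state-action pairs
               (e.g. geodesic Gaussian kernel values against k centres).
    phi_next : (n, k) features of the successor pairs under the greedy policy.
    rewards  : (n,) observed immediate rewards.
    gamma    : discount factor.
    reg      : small ridge term for numerical stability (an assumption).
    """
    A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(phi.shape[1])
    b = phi.T @ rewards
    # Value estimate after the solve: Q(s, a) ~= phi(s, a) @ w
    return np.linalg.solve(A, b)
```

LSPI alternates this evaluation step with a greedy policy-improvement step until the weights, and hence the policy, stop changing.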


Geodesic Gaussian kernels for value function approximation

The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it does not allow for discontinuity which typically arises in real-world reinforcement learning tasks. In this paper, we propose a new basis function based on ...


Manifold-based non-parametric learning of action-value functions

Finding good approximations to state-action value functions is a central problem in model-free on-line reinforcement learning. The use of non-parametric function approximators enables us to simultaneously represent the model and its confidence. Since Q-functions are usually discontinuous, we present a novel Gaussian process (GP) kernel function to cope with discontinuity. We use a manifold-based distan...


Entropy of Overcomplete Kernel Dictionaries

In signal analysis and synthesis, linear approximation theory considers a linear decomposition of any given signal in a set of atoms, collected into a so-called dictionary. Relevant sparse representations are obtained by relaxing the orthogonality condition of the atoms, yielding overcomplete dictionaries with an extended number of atoms. More generally than the linear decomposition, overcomple...
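One widely used way to keep such a dictionary sparse is a coherence test: a new sample becomes an atom only if its kernel similarity to every existing atom stays below a threshold. The sketch below uses a Gaussian kernel and a threshold `mu0` purely as illustrative assumptions; it is not the specific entropy-based analysis of that paper.

```python
import numpy as np


def gauss_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two sample vectors."""
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * sigma ** 2))


def update_dictionary(dictionary, x, mu0=0.5):
    """Add x as a new atom unless it is too coherent with an existing one."""
    if all(gauss_kernel(x, atom) <= mu0 for atom in dictionary):
        dictionary.append(x)
    return dictionary


# Streaming usage: atoms accumulate only where the data is not yet covered.
atoms = []
for sample in np.random.default_rng(0).normal(size=(200, 2)):
    update_dictionary(atoms, sample)
print(len(atoms), "atoms retained")
```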


Cortical thickness analysis in autism with heat kernel smoothing.

We present a novel data smoothing and analysis framework for cortical thickness data defined on the brain cortical manifold. Gaussian kernel smoothing, which weights neighboring observations according to their 3D Euclidean distance, has been widely used in 3D brain images to increase the signal-to-noise ratio. When the observations lie on a convoluted brain surface, however, it is more natural ...
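A rough sketch of the idea, assuming the cortical surface is given as a vertex/edge mesh: the signal is smoothed by repeated local averaging with Gaussian weights on edge length, which approximates heat diffusion along the surface rather than through 3D space. The data layout and parameter names here are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np


def heat_smooth(values, edges, coords, sigma=1.0, iterations=10):
    """Approximate heat kernel smoothing of per-vertex values on a mesh.

    values     : (n,) scalar signal on the vertices (e.g. cortical thickness).
    edges      : list of (i, j) index pairs giving the mesh connectivity.
    coords     : (n, 3) vertex coordinates used for edge-length weights.
    sigma      : bandwidth of the Gaussian weight on edge length.
    iterations : number of diffusion steps; more steps widen the smoothing.
    """
    n = len(values)
    smoothed = np.asarray(values, dtype=float).copy()
    for _ in range(iterations):
        acc = smoothed.copy()            # each vertex keeps weight 1 for itself
        wsum = np.ones(n)
        for i, j in edges:
            w = np.exp(-np.sum((coords[i] - coords[j]) ** 2) / (2.0 * sigma ** 2))
            acc[i] += w * smoothed[j]
            acc[j] += w * smoothed[i]
            wsum[i] += w
            wsum[j] += w
        smoothed = acc / wsum
    return smoothed
```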


Publication year: 2011